gerrymandering
import holoviews as hv
import geoviews as gv
import datashader as ds
import dask.dataframe as dd
from cartopy import crs
from bokeh.models import WMTSTileSource
from holoviews.operation.datashader import datashade
hv.extension('bokeh', width=95)
%opts RGB [width=1200 height=682 xaxis=None yaxis=None show_grid=False]
%opts Shape (fill_alpha=0 line_width=1.5) [apply_ranges=False tools=['tap']]
%opts Points [apply_ranges=False] WMTS (alpha=0.5)
In this notebook, we'll load data from different sources and show it all overlaid together. First, let's define a color key for racial/ethnic categories:
color_key = {'w':'blue', 'b':'green', 'a':'red', 'h':'orange', 'o':'saddlebrown'}
races = {'w':'White', 'b':'Black', 'a':'Asian', 'h':'Hispanic', 'o':'Other'}
color_points = hv.NdOverlay({races[k]: gv.Points([0,0], crs=crs.PlateCarree())(style=dict(color=v))
for k, v in color_key.items()})
Next, we'll load the 2010 US Census, with the location and race or ethnicity of every US resident as of that year (300 million data points), and define a plot using datashader to show this data with the given color key:
df = dd.read_parquet('../data/census.snappy.parq')
df = df.persist()
census_points = gv.Points(df, kdims=['easting', 'northing'], vdims=['race'])
Now we can datashade and render these points, coloring them by race:
x_range, y_range = ((-13884029.0, -7453303.5), (2818291.5, 6335972.0)) # Continental USA
shade_defaults = dict(x_range=x_range, y_range=y_range, x_sampling=10, y_sampling=10, width=1200, height=682,
color_key=color_key, aggregator=ds.count_cat('race'),)
shaded = datashade(census_points, **shade_defaults)
shaded
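To make the `ds.count_cat('race')` aggregation above more concrete, here is a minimal pure-Python sketch of the idea: each point is dropped into a pixel-sized bin, and each bin keeps a separate count per category. This is only an illustration of the concept, not datashader's actual implementation; the function name `count_by_category` and the toy points are assumptions made for this example.

```python
from collections import Counter

def count_by_category(points, x_range, y_range, width, height):
    """Bin (x, y, category) points into a height x width grid of per-category counts.

    Hypothetical helper for illustration only; datashader does this far more
    efficiently on the full dataset.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    grid = [[Counter() for _ in range(width)] for _ in range(height)]
    for x, y, cat in points:
        # Skip points that fall outside the requested viewport
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue
        col = int((x - x_min) / (x_max - x_min) * width)
        row = int((y - y_min) / (y_max - y_min) * height)
        grid[row][col][cat] += 1
    return grid

# Toy data: (x, y, race code); the last point lies outside the viewport
points = [(0.1, 0.1, 'w'), (0.2, 0.3, 'w'), (0.4, 0.2, 'b'),
          (0.9, 0.9, 'h'), (1.5, 0.5, 'a')]
grid = count_by_category(points, (0.0, 1.0), (0.0, 1.0), width=2, height=2)
```

The color of each pixel is then derived, roughly speaking, by mixing the `color_key` colors according to the per-category counts in that bin.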
Next, we'll load congressional districts from a publicly available shapefile and project them into Web Mercator format using GeoViews (which in turn calls Cartopy):
shape_path = '../data/cb_2015_us_cd114_5m.shp'
districts = gv.Shape.from_shapefile(shape_path, crs=crs.PlateCarree())
districts = gv.operation.project_shape(districts)
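For intuition about what projecting into Web Mercator involves, the standard spherical Web Mercator formula can be sketched in a few lines. GeoViews and Cartopy do the real work in the cell above; the helper name `lon_lat_to_web_mercator` is an assumption introduced here purely for illustration.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, the sphere radius used by Web Mercator

def lon_lat_to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to Web Mercator easting/northing.

    Hypothetical helper; valid for latitudes strictly between -90 and 90 degrees.
    """
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

# The equator/prime-meridian intersection maps to (approximately) the origin
origin = lon_lat_to_web_mercator(0.0, 0.0)
# The antimeridian maps to the eastern edge of the Web Mercator square
east_edge, _ = lon_lat_to_web_mercator(180.0, 0.0)
```

The `easting` and `northing` columns of the census data, and the `x_range` and `y_range` values used earlier, are expressed in these same coordinates, which is why the districts need to be projected before they can be overlaid.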
Finally, we'll define some image tiles to use as a background, using any publicly available Web Mercator tile set:
tiles = gv.WMTS(WMTSTileSource(url='https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{Z}/{Y}/{X}.jpg'))
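The `{Z}/{Y}/{X}` placeholders in the URL above are filled in per tile by the plotting library. As a rough sketch of how that works, assuming the service follows the common XYZ tiling scheme (tile origin at the top-left, `2**zoom` tiles per side), the tile covering a given location could be computed as below. The helper names `lon_lat_to_tile` and `tile_url` are hypothetical and are not part of Bokeh or GeoViews.

```python
import math

TEMPLATE = ('https://server.arcgisonline.com/ArcGIS/rest/services/'
            'World_Imagery/MapServer/tile/{Z}/{Y}/{X}.jpg')

def lon_lat_to_tile(lon_deg, lat_deg, zoom):
    """Return (x, y) tile indices for a location, assuming the XYZ tiling scheme."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    return x, y

def tile_url(template, zoom, x, y):
    """Fill the {Z}, {X}, and {Y} placeholders of a tile URL template."""
    return template.format(Z=zoom, X=x, Y=y)

# Tile containing Washington, DC (approximate coordinates) at zoom level 4
x, y = lon_lat_to_tile(-77.0369, 38.9072, 4)
url = tile_url(TEMPLATE, 4, x, y)
```

In the notebook itself none of this needs to be done by hand; `WMTSTileSource` requests the appropriate tiles whenever the viewport changes.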
Each of these data sources can be visualized on its own (just type its name in a separate cell), but they can also easily be combined into a single overlaid plot to see the relationships:
shaded = datashade(census_points, **shade_defaults)
tiles * shaded * color_points * districts
You should now be able to interactively explore these three linked datasets, to see how they all relate to each other. In a live notebook, this plot will support a variety of interactive features:
- Pan/zoom: Select the "wheel zoom" tool at the left, and you can zoom in on any region of interest using your scroll wheel. The shapes should update immediately, while the map tiles will update as soon as they are loaded from the external server, and the racial data will be updated once it has been rendered for the current viewport by datashader. This behavior is the default for any HoloViews plot using a Bokeh backend.
- Tapping: Click on any region of the USA and the congressional district for that region will be highlighted (and the rest dimmed). This behavior was enabled for the shape outlines by specifying the "tap" tool in the options above.
Most of these interactive features are also available in the static HTML copy visible at anaconda.org, with the restriction that because there is no Python process running, the racial/population data will be limited to the resolution at which it was initially rendered, rather than being dynamically re-rendered to fit the current zoom level. Thus in a static copy, the data will look pixelated when you zoom in, whereas in a live notebook you can zoom all the way down to individual datapoints (people) in each region.
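The difference between the static and live versions can be made concrete with a little arithmetic on the viewport. The sketch below computes how many Web Mercator units (nominally meters) each pixel spans for the initial continental-USA view, and for a much narrower viewport. The zoomed range and the helper name `units_per_pixel` are made up for illustration.

```python
def units_per_pixel(axis_range, n_pixels):
    """Width of one pixel in data units for a given axis range (hypothetical helper)."""
    lo, hi = axis_range
    return (hi - lo) / n_pixels

# Initial view: the continental-USA x_range from above, rendered 1200 pixels wide
initial = units_per_pixel((-13884029.0, -7453303.5), 1200)

# Hypothetical zoomed-in viewport only 100 km wide, re-rendered at the same pixel width
zoomed = units_per_pixel((-8600000.0, -8500000.0), 1200)

print(f"initial: {initial:.1f} units/pixel, zoomed: {zoomed:.1f} units/pixel")
```

In the static copy every bin keeps its initial size of several kilometers no matter how far you zoom, whereas a live session re-aggregates the points for each new viewport, so the bins shrink as you zoom in.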